Abstract
Background: Hemorrhagic cystitis (HC) is a frequent and debilitating complication after allogeneic hematopoietic stem cell transplantation (allo-HSCT) in patients with thalassemia major (TM), with limited tools available for early identification. This study aimed to identify clinical risk factors for HC and construct a predictive model using both conventional and machine learning (ML) approaches.
Methods: We retrospectively analyzed TM patients who underwent allo-HSCT at the First Affiliated Hospital of Guangxi Medical University from October 2020 to April 2023. Candidate risk factors were screened using least absolute shrinkage and selection operator (LASSO) regression and univariate area under the curve (AUC) analysis, and the variables selected by both methods (their intersection, shown in a Venn diagram) were retained. Logistic regression was used to construct a clinical nomogram incorporating the selected predictors. Model performance was evaluated through discrimination (AUC, C-index), calibration, and decision curve analysis in both training and testing cohorts. Additional machine learning models, including random forest, decision tree, support vector machine (SVM), neural network, XGBoost, and LightGBM, were developed to validate the nomogram's predictive value.
Results: A total of 222 patients (mean age 8.5 years, range 2–19; 60.4% male) were included, of whom 72 (32.4%) developed HC at a median onset of 27 days post-transplant. Five predictors were identified: age, serum ferritin, acute graft-versus-host disease (aGVHD), sepsis, and tacrolimus exposure. The nomogram yielded an AUC of 0.690 (95% CI: 0.595–0.785) in the training set, with a C-index of 0.690 and an internal validation C-index of 0.647. Calibration curves showed good agreement between predicted and observed risks, and decision curve analysis demonstrated favorable clinical utility. In the testing set, the AUC was 0.672 (95% CI: 0.531–0.812) with a consistent C-index. Among all ML models tested, logistic regression remained the most robust, with superior interpretability and performance.
Conclusion: This study identified key clinical risk factors for HC after allo-HSCT in TM patients and developed a machine learning–enhanced prediction tool. The nomogram model demonstrated stable performance and clinical utility, offering a potential strategy for early identification and intervention in high-risk patients.
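The variable-screening step described in the Methods can be illustrated with a minimal sketch. It computes a univariate AUC for each candidate variable using the rank-based (Mann-Whitney) formulation and intersects the result with a set of LASSO-selected variables. The toy data, the 0.6 AUC threshold, the function names, and the contents of `lasso_selected` are assumptions made for illustration only; they are not taken from the study.

```python
def univariate_auc(values, labels):
    """AUC of one predictor via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def screen_by_auc(data, labels, threshold=0.6):
    """Keep variables whose direction-agnostic AUC meets the threshold.

    The threshold is a hypothetical cut-off, not one reported in the abstract.
    """
    selected = set()
    for name, values in data.items():
        auc = univariate_auc(values, labels)
        if max(auc, 1.0 - auc) >= threshold:
            selected.add(name)
    return selected


# Toy data, invented for illustration: 1 = developed HC, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
data = {
    "age": [12, 10, 9, 5, 11, 4],
    "ferritin": [3000, 2500, 4000, 1000, 1500, 2000],
    "noise": [1, 2, 3, 1, 2, 3],
}

# Hypothetical output of a separate LASSO step (not computed here).
lasso_selected = {"age", "ferritin", "sepsis"}

auc_selected = screen_by_auc(data, labels)
# The intersection corresponds to the overlap region of the Venn diagram.
final_predictors = sorted(lasso_selected & auc_selected)
print(final_predictors)
```

In this sketch only variables passing both screens are carried forward, which mirrors the intersection logic; the retained variables would then be entered into a logistic regression to build the nomogram.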